A Proof of Green's Conjecture Regarding the Removal Properties of Sets of Linear Equations



Asaf Shapira



Abstract 

A system of $\ell$ linear equations in $p$ unknowns $Mx = b$ is said to have the removal property if every set $S \subseteq \{1, \ldots, n\}$ which contains $o(n^{p-\ell})$ solutions of $Mx = b$ can be turned into a set $S'$ containing no solution of $Mx = b$, by the removal of $o(n)$ elements. Green [GAFA 2005] proved that a single homogenous linear equation always has the removal property, and conjectured that every set of homogenous linear equations has the removal property. We confirm Green's conjecture by showing that every set of linear equations (even non-homogenous) has the removal property.



1 Introduction

The (triangle) removal lemma of Ruzsa and Szemerédi [17], which is by now a cornerstone result in combinatorics, states that a graph on $n$ vertices that contains only $o(n^3)$ triangles can be made triangle free by the removal of only $o(n^2)$ edges. In other words, if a graph has asymptotically few triangles then it is asymptotically close to being triangle free. While the lemma was proved in [17] for triangles, an analogous result for any fixed graph can be obtained using the same proof idea. Actually, the main tool for obtaining the removal lemma is Szemerédi's regularity lemma for graphs [19], another landmark result in combinatorics. The removal lemma has many applications in different areas like extremal graph theory, additive number theory and theoretical computer science. Perhaps its most well known application appears already in [17], where it is shown that an ingenious application of it gives a very short and elegant proof of Roth's Theorem, which states that every $S \subseteq [n] = \{1, \ldots, n\}$ of positive density contains a 3-term arithmetic progression.

Recall that an $r$-uniform hypergraph $H = (V, E)$ has a set of vertices $V$ and a set of edges $E$, where each edge $e \in E$ contains $r$ distinct vertices from $V$. So a graph is a 2-uniform hypergraph. Szemerédi's famous theorem [18] extends Roth's theorem by showing that every $S \subseteq [n]$ of positive density actually contains arbitrarily long arithmetic progressions (when $n$ is large enough). Motivated by the fact that a removal lemma for graphs can be used to prove Roth's theorem, Frankl and Rödl [6] showed that a removal lemma for $r$-uniform hypergraphs could be used to prove Szemerédi's theorem on $(r+1)$-term arithmetic progressions. They further developed a regularity lemma, as well as a corresponding removal lemma, for 3-uniform hypergraphs, thus obtaining a new proof of Szemerédi's theorem for 4-term arithmetic progressions. In recent years there have been many exciting results in this area, in particular the results of Gowers [8] and of Nagle, Rödl, Schacht and Skokan [14, 15], who independently obtained regularity lemmas and removal lemmas for $r$-uniform hypergraphs, thus providing alternative combinatorial proofs of Szemerédi's Theorem [18] and some of its generalizations, notably those of Furstenberg and Katznelson [7]. Tao [20] later obtained another proof of the hypergraph removal lemma and of its many corollaries mentioned above. For more details see [11].

In this paper we will use the above mentioned hypergraph removal lemma in order to resolve a conjecture of Green [10] regarding the removal properties of sets of linear equations. Let $Mx = b$ be a set of linear equations, and let us say that a set of integers $S$ is $(M, b)$-free if it contains no solution to $Mx = b$, that is, if there is no vector $x$, whose entries all belong to $S$, which satisfies $Mx = b$. Just like the removal lemma for graphs states that a graph that has few copies of $H$ should be close to being $H$-free, a removal lemma for sets of linear equations $Mx = b$ should say that a subset of the integers $[n]$ that contains few solutions to $Mx = b$ should be close to being $(M, b)$-free. Let us start by defining this notion precisely.

Definition 1.1 (Removal Property) Let $M$ be an $\ell \times p$ matrix of integers and let $b \in \mathbb{N}^{\ell}$. The set of linear equations $Mx = b$ has the removal property if for every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, M, b) > 0$ with the following property: if $S \subseteq [n]$ is such that there are at most $\epsilon n^{p-\ell}$ vectors $x \in S^p$ satisfying $Mx = b$, then one can remove from $S$ at most $\delta n$ elements to obtain an $(M, b)$-free set.
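
To make the quantity counted in Definition 1.1 concrete, the following is a minimal brute-force sketch that counts the vectors $x \in S^p$ with $Mx = b$ over the integers. The function name `count_solutions` and the example set are illustrative assumptions, not part of the paper, and the enumeration is only feasible for very small $S$ and $p$.

```python
from itertools import product


def count_solutions(M, b, S):
    """Count vectors x in S^p satisfying M x = b over the integers (brute force)."""
    p = len(M[0])
    count = 0
    for x in product(sorted(S), repeat=p):
        # x is a solution when every row equation holds exactly.
        if all(sum(row[j] * x[j] for j in range(p)) == b_i
               for row, b_i in zip(M, b)):
            count += 1
    return count


# Illustrative example: the single equation x1 - 2*x2 + x3 = 0,
# whose solutions are the 3-term arithmetic progressions (trivial ones included).
M_example = [[1, -2, 1]]
b_example = [0]
S_example = {1, 2, 3, 5, 8}
print(count_solutions(M_example, b_example, S_example))
```

Note that the count includes the trivial solutions $x_1 = x_2 = x_3$, exactly as the definition counts all vectors in $S^p$.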

We note that in the above definition, as well as throughout the paper, we assume that the $\ell \times p$ matrix $M$ of a set of linear equations has rank $\ell$.

Green [10] has initiated the study of the removal properties of sets of linear equations. His main result was the following:

Theorem 1 (Green [10]) Any single homogenous linear equation has the removal property.

The main result of Green actually holds over any abelian group. To prove this result, Green developed a regularity lemma for abelian groups, which is somewhat analogous to Szemerédi's regularity lemma for graphs [19]. Although the application of the group regularity lemma for proving Theorem 1 was similar to the derivation of the graph removal lemma from the graph regularity lemma, the proof of the group regularity lemma was far from trivial. One of the main conjectures raised in [10] is that a natural generalization of Theorem 1 should also hold (Conjecture 9.4 in [10]).

Conjecture 1 (Green [10]) Any system of homogenous linear equations $Mx = 0$ has the removal property.



We note that besides being a natural generalization of Theorem 1, Conjecture 1 was also raised in [10] with relation to a conjecture of Bergelson, Host, Kra and Ruzsa [3] regarding the number of $k$-term arithmetic progressions with a common difference in subsets of $[n]$. See Section 4 for more details.

Very recently, Král', Serra and Vena [13] gave a surprisingly simple proof of Theorem 1 which completely avoided the use of Green's regularity lemma for groups. In fact, their proof is an elegant and simple application of the graph removal lemma mentioned earlier and it actually extends Theorem 1 to any single non-homogenous linear equation over non-abelian groups. Král', Serra and Vena [13] also show that Conjecture 1 holds when $M$ is a 0/1 matrix which satisfies certain conditions. But these conditions are not satisfied even by all 0/1 matrices. In another recent result, which was obtained independently of ours, Candela [5] showed that Conjecture 1 holds for every pair of homogenous linear equations, as well as for every system of homogenous equations in which every $\ell$ columns of $M$ are linearly independent. See more details in Subsection 2.1.

In this paper we confirm Green's conjecture for every homogenous set of linear equations. In fact, we prove the following more general result.

Theorem 2 (Main Result) Any set of linear equations (even non homogenous) Mx = b has the 
removal property. 

The rest of the paper is organized as follows. In the next section we give an overview of the proof of Theorem 2. As we show in that section, Theorem 2 also holds over any finite field, that is, when $S \subseteq \mathbb{F}_n$, where $\mathbb{F}_n$ is the field of size $n$. In fact it is easy to modify the proof so that it works over any field, but we will not do so here. The proof of Theorem 2 has two main steps: the first one, described in Lemma 2.3, applies the main idea from [13] in order to show that if a set of linear equations can be "represented" by a hypergraph then Theorem 2 would follow from the hypergraph removal lemma. So the second, and most challenging, step of the proof is showing that every set of linear equations can be represented as a hypergraph. The proof of this result, stated in Lemma 2.4, appears in Section 3. In Section 4 we give some concluding remarks and discuss some open problems.

2 Proof Overview 

It will be more convenient to deduce Theorem 2 from an analogous result over the finite field $\mathbb{F}_n$ of size $n$ (for $n$ a prime power). In fact, somewhat surprisingly, we will actually need to prove a stronger claim than the one asserted in Theorem 2. This more general variant, stated in Theorem 3, allows each of the variables $x_i$ to have its own subset $S_i \subseteq [n]$. We note that this variant of Theorem 2 for the case of a single equation was already proved in [10] and [13], but in those papers it was not necessary to go through this more general result. As we will explain later (see Claims 3.1 and 3.3), the fact that we are considering a more general problem will allow us to overcome some degeneracies in the system of equations by allowing us to remove certain equations. This manipulation can be performed when one considers the generalized removal property (defined below) but there is no natural way of performing these manipulations when considering the standard removal property. Therefore, proving this extended result is essential for our proof strategy.

In what follows and throughout the paper, whenever $x$ is a vector, $x_i$ will denote its $i$-th entry. Similarly, if $x_1, \ldots, x_p$ are elements in a field, then $x$ will be the vector whose entries are $x_1, \ldots, x_p$. We say that a collection of $p$ subsets $S_1, \ldots, S_p \subseteq \mathbb{F}_n$ is $(M, b)$-free if there are no $x_1 \in S_1, \ldots, x_p \in S_p$ which satisfy $Mx = b$.

Definition 2.1 (Generalized Removal Property over Finite Fields) Let $\mathbb{F}_n$ be the field of size $n$, let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$ and let $b \in \mathbb{F}_n^{\ell}$. The system $Mx = b$ is said to have the generalized removal property if for every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, p) > 0$ such that if $S_1, \ldots, S_p \subseteq \mathbb{F}_n$ contain less than $\epsilon n^{p-\ell}$ solutions to $Mx = b$ with each $x_i \in S_i$, then one can remove from each $S_i$ at most $\delta n$ elements to obtain sets $S'_1, \ldots, S'_p$ which are $(M, b)$-free.

By taking all sets $S_i$ to be the same set $S$ we, of course, get the standard notion of the removal property from Definition 1.1, so we may indeed work with this generalized definition. We will deduce Theorem 2 from the following theorem.

Theorem 3 Every set of linear equations Mx = b over a finite field has the generalized removal 
property. 

In this paper we apply the hypergraph removal lemma in order to resolve Green's conjecture. In fact, for the proof of Theorem 3 we will need a variant of the hypergraph removal lemma which works for colored hypergraphs. But let us first recall some basic definitions. An $r$-uniform hypergraph is simple if it has no parallel edges, that is, if different edges contain different subsets of vertices of size $r$. We say that a set of vertices $U$ in an $r$-uniform hypergraph $H = (V_H, E_H)$ spans a copy of an $r$-uniform hypergraph $K = (V_K, E_K)$ if there is an injective mapping $\phi$ from $V_K$ to $U$ such that if $v_1, \ldots, v_r$ form an edge in $K$ then $\phi(v_1), \ldots, \phi(v_r)$ form an edge in $U \subseteq V_H$. We say that a hypergraph is $c$-colored if its edges are colored by $\{1, \ldots, c\}$. If $K$ and $H$ are $c$-colored, then $U$ is said to span a colored copy of $K$ if the above mapping $\phi$ sends edges of $K$ of color $i$ to edges of $H$ (in $U$) of the same color $i$. We stress that the coloring of the edges does not have to satisfy any constraints that are usually associated with edge colorings. Finally, the number of colored copies of $K$ in $H$ is the number of subsets $U \subseteq V_H$ of size $|V_K|$ which span a colored copy of $K$.

The following variant of the hypergraph removal lemma is a special case of Theorem 1.2 in [2].^1

^1 As noted to us by Terry Tao, this variant of the hypergraph removal lemma can probably be extracted from the previous proofs of the hypergraph removal lemma [8, 14, 15, 20], just like the colored removal lemma for graphs can be extracted from the proof of the graph removal lemma, see [12].



Theorem 4 (Austin and Tao [2]) Let $K$ be a fixed $r$-uniform $c$-colored hypergraph on $k$ vertices. For every $\delta > 0$ there is an $\epsilon = \epsilon(\delta, k) > 0$ such that if $H$ is an $r$-uniform $c$-colored simple hypergraph with less than $\epsilon n^k$ colored copies of $K$, then one can remove from $H$ at most $\delta n^r$ edges and obtain a hypergraph that contains no colored copy of $K$.

In order to use Theorem 4 for the proof of Theorem 3, we will need to represent the solutions of $Mx = b$ as colored copies of a certain "small" hypergraph $K$ in a certain "large" hypergraph $H$. The following notion of hypergraph representability specifies the requirements from such a representation that suffice for allowing us to deduce Theorem 3 from Theorem 4.

Definition 2.2 (Hypergraph Representation) Let $\mathbb{F}_n$ be the field of size $n$, let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$. The system of linear equations $Mx = b$ is said to be hypergraph representable if there is an integer $r = r(M, b) \le p^2$ and an $r$-uniform $p$-colored hypergraph $K$ with $k = r - 1 + p - \ell$ vertices and $p$ edges, such that for any $S_1, \ldots, S_p \subseteq [n]$ there is an $r$-uniform hypergraph $H$ on $kn$ vertices which satisfies the following:

1. $H$ is simple and each edge with color $i$ is labeled by one of the elements of $S_i$.

2. If $x_1 \in S_1, \ldots, x_p \in S_p$ satisfy $Mx = b$ then $H$ contains $n^{r-1}$ colored copies of $K$, such that their edge with color $i$ has label $x_i$. These colored copies of $K$ should also be edge disjoint.

3. If $S_1, \ldots, S_p$ contain $T$ solutions to $Mx = b$ with $x_i \in S_i$ then $H$ contains $T n^{r-1}$ colored copies of $K$.

The following lemma shows that a hypergraph representation can allow us to prove Theorem 3 using the hypergraph removal lemma.

Lemma 2.3 If Mx = b has a hypergraph representation then it has the generalized removal property. 

Proof: Suppose $Mx = b$ is a system of $\ell$ linear equations in $p$ unknowns. Let $S_1, \ldots, S_p$ be $p$ subsets of $\mathbb{F}_n$ and let $H$ be the hypergraph guaranteed by Definition 2.2. We claim that we can take $\epsilon(\delta, p)$ in Definition 2.1 to be the value $\epsilon = \epsilon(\delta / (p k^r), k)$ from Theorem 4. Note that $r, k < 2p^2$ so this still implies that $\epsilon$ is only a function of $\delta$ and $p$. Indeed, if $S_1, \ldots, S_p$ contain only $\epsilon n^{p-\ell}$ solutions to $Mx = b$ then by item 3 of Definition 2.2 we get that $H$ contains at most $\epsilon n^{p-\ell} \cdot n^{r-1} = \epsilon n^k$ colored copies of $K$. As $H$ is simple, we can apply the removal lemma for colored hypergraphs (Theorem 4) to conclude that one can remove a set $E$ of at most $\frac{\delta}{p k^r} (kn)^r = \frac{\delta}{p} n^r$ edges from $H$ and thus destroy all the colored copies of $K$ in $H$ (recall that $H$ has $kn$ vertices).

To show that we can turn $S_1, \ldots, S_p$ into a collection of $(M, b)$-free sets by removing only $\delta n$ elements from each $S_i$, let us remove an element $s$ from $S_i$ if $E$ contains at least $n^{r-1}/p$ edges that are colored with $i$ and labeled with $s$. As each edge has one label (because $H$ has no parallel edges), and $|E| \le \frac{\delta}{p} n^r$, this means that we remove only $\delta n$ elements from each $S_i$. To see that we thus turn $S_1, \ldots, S_p$ into $(M, b)$-free sets, suppose that the new sets $S'_1, \ldots, S'_p$ still contain a solution $s_1 \in S'_1, \ldots, s_p \in S'_p$ to $Mx = b$. By item 2 of Definition 2.2 this solution defines $n^{r-1}$ edge disjoint colored copies of $K$ in $H$, with the property that in every colored copy, the edge with color $i$ is labeled with the same element $s_i \in S_i$. As $E$ must contain at least one edge from each of these colored copies (as it should destroy all such copies), there must be some $1 \le i \le p$ for which $E$ contains at least $n^{r-1}/p$ edges that are colored $i$ and labeled with $s_i$. But this contradicts the fact that $s_i$ should have been removed from $S_i$. ■
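
The counting step in the proof above can be written out as a short worked bound. It uses only the proof's own hypotheses, namely that $|E| \le \frac{\delta}{p} n^r$ and that an element is removed from $S_i$ only when at least $n^{r-1}/p$ edges of $E$ carry color $i$ and that element as label. The symbol $R_i$, for the set of elements removed from $S_i$, is notation introduced here purely for illustration.

```latex
% R_i: the set of elements removed from S_i (notation introduced for this sketch only).
% Every removed element accounts for at least n^{r-1}/p edges of E,
% and distinct elements account for distinct edges because each edge has one label.
\[
  |R_i| \cdot \frac{n^{r-1}}{p} \;\le\; |E| \;\le\; \frac{\delta}{p}\, n^{r}
  \quad\Longrightarrow\quad
  |R_i| \;\le\; \delta n .
\]
```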

We note that the above lemma generalizes a similar lemma for the case of representing a single equation using a graph, which was implicit in [13]. In fact, as we have mentioned earlier, [13] also show that a set of homogenous linear equations $Mx = 0$, with $M$ being a 0/1 matrix that satisfies certain conditions, also has the removal property. One of these conditions essentially says that the system of equations is graph representable. However, there are even some 0/1 matrices for which $Mx = 0$ is not graph representable (in the sense of [13]). Lemma 2.4 below shows that any set of linear equations has a hypergraph representation. This lemma is proved in the next section and it is the most challenging part of this paper.

Lemma 2.4 Every set of linear equations Mx = b over a finite field is hypergraph representable. 

From the above two lemmas we get the following. 

Proof of Theorem 3: Immediate from Lemma 2.3 and Lemma 2.4.

As we have mentioned before, Theorem 2 is now an easy application of Theorem 3.

Proof of Theorem 2: Given a set of linear equations $Mx = b$ in $p$ unknowns, let $c$ be the maximum absolute value of the entries of $M$ and $b$. Given an integer $n$ let $q = q(n)$ be the smallest prime larger than $cp^2n$. It is well known that $q \le 2cp^2n$ (in fact, much better bounds are known). It is clear that for a vector $x \in [n]^p$ we have $Mx = b$ over $\mathbb{R}$ if and only if $Mx = b$ over $\mathbb{F}_q$. So if $Mx = b$ has $o(n^{p-\ell})$ solutions with $x_i \in S_i$ over $\mathbb{R}$, it also has $o(q^{p-\ell})$ solutions with $x_i \in S_i \subseteq \mathbb{F}_q$ over $\mathbb{F}_q$. By Theorem 3 we can remove $o(q)$ elements from each $S_i$ and obtain sets $S'_i$ that are $(M, b)$-free. But as $q = O(n)$ we infer that the removal of the same $o(q) = o(n)$ elements also guarantees that the sets are $(M, b)$-free over $\mathbb{R}$. ■
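
The reduction from the integers to a prime field in the proof above can be checked numerically on small instances. The sketch below is only an illustration: it assumes a prime $q$ exceeding $c \cdot p^2 \cdot n$ (any prime larger than the maximal possible value of $|M_i x - b_i|$ for $x \in [n]^p$ would serve equally well, and $c p^2 n \ge cpn + c$ whenever $p \ge 2$), and the helper names `is_prime`, `smallest_prime_above` and `solutions_agree` are hypothetical.

```python
from itertools import product


def is_prime(m):
    """Trial-division primality test, adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def smallest_prime_above(m):
    """Return the smallest prime strictly larger than m."""
    q = m + 1
    while not is_prime(q):
        q += 1
    return q


def solutions_agree(M, b, n):
    """Check that, for every x in [n]^p, Mx = b over the integers iff Mx = b mod q."""
    p = len(M[0])
    c = max(max(abs(v) for row in M for v in row), max(abs(v) for v in b))
    q = smallest_prime_above(c * p * p * n)  # assumed bound: q > c * p^2 * n
    for x in product(range(1, n + 1), repeat=p):
        residuals = [sum(row[j] * x[j] for j in range(p)) - b_i
                     for row, b_i in zip(M, b)]
        over_integers = all(v == 0 for v in residuals)
        over_field = all(v % q == 0 for v in residuals)
        if over_integers != over_field:
            return False
    return True


print(solutions_agree([[1, -2, 1]], [0], 4))
```

The check succeeds because every residual $M_i x - b_i$ has absolute value smaller than $q$, so it is divisible by $q$ only when it is zero.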

2.1 Overview of the Proof of Lemma 2.4

Let us start by noting that Lemma 2.4 for the case of a single equation was (implicitly) proven in [13], where they show that one can take $r = 2$; in other words, they represent a single equation as a graph $K$ in a graph $H$. Actually, the graph $K$ in the proof of [13] is a cycle of length $p$. The proof in [13] is very short and elegant, and we recommend reading it to better understand the intuition behind our proof (although this paper is, of course, self contained). Another related result is the proof of Szemerédi's theorem [18] using the hypergraph removal lemma [6], which can be interpreted as (essentially) showing that the set of $p - 2$ linear equations which define a $p$-term arithmetic progression^2 is hypergraph representable with $K$ being the complete $(p-1)$-uniform hypergraph of size $p$. "Interpolating" these two special cases of Lemma 2.4 suggests that a hypergraph representation of a set of $\ell$ linear equations in $p$ unknowns should involve an $(\ell+1)$-uniform hypergraph $K$ of size $p$. And indeed, we initially found a (relatively) simple way to achieve this for $p - 2$ equations in $p$ unknowns, thus extending the representability of the arithmetic progression set of linear equations.

However, somewhat surprisingly, when $1 < \ell < p - 2$ the situation becomes much more complicated and we did not manage to find a simple representation along the lines of the above two cases. The problem with trying to extend the previous approaches to larger sets of equations is that obtaining all the requirements of Definition 2.2 turns out to be very complicated when $M$ has a set of $\ell$ columns that are not linearly independent. Let us mention again that Candela [5] has recently considered linear equations $Mx = 0$ in which every $\ell$ columns are linearly independent, and showed that Conjecture 1 holds in these cases.

The way we overcome the above complications is by using a representation involving hypergraphs of a much larger degree of uniformity (that is, larger edges), which is roughly the number of non-zero entries of $M$ after we perform certain manipulations on it. We note that specializing our proof to either the case $\ell = 1$ or to the case $\ell = p - 2$ does not give proofs that are identical to the ones (implicit) in [6] or [13]. For example, our proof for the case of a single equation in $p$ unknowns uses a $(p-1)$-uniform hypergraph, rather than a graph as in [13].

So let us give a brief overview of the proof. We need to find a small hypergraph $K$ with $p$ edges, whose copies, within another hypergraph $H$, will represent the solutions to $Mx = b$. Each edge of $H$, and therefore also $K$, will have a color $1 \le i \le p$ and a label $s \in S_i$. The system $Mx = b$ has $p$ unknowns and $K$ has $p$ edges and it may certainly be the case that all the entries of $M$ are non-zero. It is apparent that using all the edges of $K$ to "deduce" a linear equation of $Mx = b$ is not a good idea because in that way we will only be able to extract one equation from a copy of $K$ and we need to extract $\ell$ such equations. Therefore, we will first "diagonalize" an $\ell \times \ell$ sub-matrix of $M$ to get an equivalent set of equations (which we still denote by $Mx = b$) which has the property that $p - \ell$ of its unknowns $x_1, \ldots, x_{p-\ell}$ (can) appear in all equations and the rest of the $\ell$ unknowns $x_{p-\ell+1}, \ldots, x_p$ each appear in precisely one equation. This suggests the idea of extracting equation $i$ from (some of) the edges corresponding to $x_1, \ldots, x_{p-\ell}$ and one of the edges corresponding to $x_{p-\ell+1}, \ldots, x_p$. The hypergraph $K$ first contains $p - \ell$ edges that do not depend on the structure of $M$. The other $\ell$ edges do depend on the structure of $M$ and use the previous $p - \ell$ edges in order to "construct" the equations of $Mx = b$. The way to think about this is that for any copy of $K$ in $H$ the first $p - \ell$ edges will have a special vertex that will hold a value from $S_i$ (this will be the vertex in one of the sets $U_1, \ldots, U_{p-\ell}$ defined in Section 3). The other $\ell$ edges will include some of these special vertices, depending on the equation we are trying to build. The way we will deduce an equation from a copy of $K$ in $H$ is that we will argue that the fact that two edges have a common vertex means that a certain equation holds. See Claim 3.4.

^2 These linear equations are $x_1 + x_3 = 2x_2$, $x_2 + x_4 = 2x_3$, \ldots, $x_{p-2} + x_p = 2x_{p-1}$.

But there is another complication here because the linear equation we obtain in the above process will contain many other variables not from the sets $S_i$, which will need to vanish from such an equation, in order to allow us to extract the linear equations we are really interested in. The reason for these "extra" variables is that $H$ needs to contain $n^{r-1}$ edge disjoint copies of $K$ for every solution of $Mx = b$. Hence, an edge of $H$ will actually be parameterized by several other elements from $\mathbb{F}_n$ (these are the elements $x_1, \ldots, x_{r-1}$ that are used after Claim 3.2). So we will need to make sure that these extra variables vanish in the linear equation which we extract from a copy of $K$. To make sure this happens we will need to carefully choose the vertices of each edge within $H$.

A final complication arises from the fact that while we need $H$ to contain relatively few copies of $K$, we also need it to contain many edge disjoint copies of $K$ for every solution of $Mx = b$. To this end we will think of each vertex of $H$ as a linear equation and we will want the linear equations corresponding to the vertices of an edge to be linearly independent. The reason why it is hard to prove Lemma 2.4 using an $(\ell+1)$-uniform hypergraph (as the results of [13] and [6] may suggest) is that it seems very hard to obtain all the above requirements simultaneously. The fact that we are considering hypergraphs with a larger degree of uniformity will allow us (in some sense) to break the dependencies between these requirements.

3 Proof of Lemma 2.4

Let $M$ be an $\ell \times p$ matrix over $\mathbb{F}_n$ and $b \in \mathbb{F}_n^{\ell}$. We will first perform a series of operations on $M$ and $b$ which will help us in proving Lemma 2.4. For convenience, we will continue to refer to the transformed matrix and vector as $M$ and $b$. Suppose, without loss of generality, that the last $\ell$ columns of $M$ are linearly independent. We can thus transform $M$ (and accordingly also $b$) into an equivalent set of equations in which the last $\ell$ columns form an identity matrix. For a row $M_i$ of $M$ let $m_i$ be the largest index $1 \le j \le p - \ell$ for which $M_{i,j}$ is non-zero. Let $W_i$ denote the set of indices $1 \le j \le m_i - 1$ for which $M_{i,j}$ is non-zero. Therefore, $M_i$ has $|W_i| + 2$ non-zero entries. We will need the following claim, in which we make use of the fact that we are actually proving that every set of equations has the generalized removal property and not just the removal property.
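
The bookkeeping quantities $m_i$ and $W_i$ can be read off directly from the reduced matrix. The sketch below is an illustration under the stated assumption that the last $\ell$ columns of $M$ already form an identity matrix; the function name `row_structure` and the example matrix are hypothetical, and indices are reported 1-based to match the text.

```python
def row_structure(M, ell):
    """For each row of M (last ell columns assumed to be the identity), return
    (m_i, W_i) with 1-based indices: m_i is the largest index j <= p - ell with
    M[i][j] != 0, and W_i lists the smaller indices with non-zero entries."""
    p = len(M[0])
    free = p - ell  # number of columns outside the identity block
    result = []
    for row in M:
        nonzero = [j + 1 for j in range(free) if row[j] != 0]
        if not nonzero:
            # Degenerate row: m_i is undefined (fewer than 2 non-zero entries).
            result.append((None, []))
        else:
            result.append((nonzero[-1], nonzero[:-1]))
    return result


# Illustrative example with p = 5 unknowns and ell = 2 equations.
M_example = [
    [1, 2, 3, 1, 0],
    [0, 4, 5, 0, 1],
]
for m_i, W_i in row_structure(M_example, 2):
    # Each non-degenerate row has |W_i| + 2 non-zero entries.
    print(m_i, W_i)
```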

Claim 3.1 Suppose that every set of $\ell - 1$ equations in $p - 1$ unknowns over $\mathbb{F}_n$ has the generalized removal property. Suppose that the matrix $M$ defined above has a row with less than 3 non-zero entries. Then $Mx = b$ has the generalized removal property as well.

Proof: Suppose that (say) the first row of $M$ has at most 2 non-zero entries. If this row has two non-zero elements then we can assume without loss of generality that it is of the form $x_1 = b - a \cdot x_j$ where $p - \ell + 1 \le j \le p$. But then we can get an equivalent set of linear equations $M'x = b'$ by removing the first row from $M$, removing the column in which $x_j$ appears (because $x_j$ does not appear in other rows), removing the first entry of $b$ and updating $S_1$ to be $S'_1 = S_1 \cap \{b - a \cdot s : s \in S_j\}$. We thus get an instance $M'x = b'$ with $\ell - 1$ equations and $p - 1$ unknowns, hence we can use the assumption of the claim because: (i) the number of solutions of $Mx = b$ with $x_i \in S_i$ is precisely the number of solutions of $M'x = b'$ with $x_1 \in S'_1, x_2 \in S_2, \ldots, x_{j-1} \in S_{j-1}, x_{j+1} \in S_{j+1}, \ldots, x_p \in S_p$; (ii) if we can remove $\delta n$ elements from each of the sets of the new instance and thus obtain sets with no solution of $M'x = b'$ then the removal of the same elements from the original sets $S_i$ would also give sets with no solution of $Mx = b$.

If the first row of $M$ has just one non-zero entry, then this equation is of the form $x_j = b$ for some $p - \ell + 1 \le j \le p$ and $b \in \mathbb{F}_n$. If $b \notin S_j$ then the sets contain no solution to $Mx = b$ and there is nothing to prove. If $b \in S_j$ then the number of solutions to $Mx = b$ is the number of solutions of the set of equations $M'x = b'$ where $M'$ is obtained by removing the row and column to which $x_j$ belongs and by removing the first entry of $b$. As in the previous case we can now use the assumption of the claim. ■

Claim 3.1 implies that we can assume without loss of generality that none of the sets $W_1, \ldots, W_\ell$ is empty, because if one of them is empty then the corresponding row of $M$ contains less than 3 non-zero entries. In that case we can iteratively remove equations from $M$ until we either: (i) get a set of linear equations in which none of the rows has less than 3 non-zero entries, in which case we can use the fact that the result holds for such sets of equations as we will next show, or (ii) we get a single equation with only 2 unknowns with a non-zero coefficient^3. It is now easy to see that such an equation has the removal property. Indeed, suppose the equation has $p$ unknowns and only $x_1$ and $x_2$ have a non-zero coefficient. So the equation is $a_1 \cdot x_1 + a_2 \cdot x_2 + \sum_{j=3}^{p} 0 \cdot x_j = b$. In this case the number of solutions to the equation from sets $S_1, \ldots, S_p$ is the number of solutions to the equation $a_1 x_1 + a_2 x_2 = b$ with $x_1 \in S_1, x_2 \in S_2$ multiplied by $\prod_{i=3}^{p} |S_i|$. Therefore, if $S_1, \ldots, S_p$ contain $o(n^{p-1})$ solutions, then either (i) one of the sets $S_3, \ldots, S_p$ is of size $o(n)$, so we can remove all the elements from this set, or (ii) $S_1, S_2$ contain $o(n)$ solutions to $a_1 \cdot x_1 + a_2 \cdot x_2 = b$, but in this case, for every solution $(s_1, s_2)$ we can remove $s_1$ from $S_1$. In either case the new sets $S'_1, \ldots, S'_p$ contain no solution of the equation, as needed.

^3 Note that this process can result in having unknowns with a zero coefficient in all the remaining equations.

We now return to the proof of Lemma 2.4 with the assumption that none of the sets $W_i$ is empty. Let us multiply each of the rows $M_i$ of $M$ by $M_{i,m_i}^{-1}$, so that for every $1 \le i \le \ell$ we have $M_{i,m_i} = 1$. For every $1 \le i \le \ell$ let $d_i \in \{p - \ell + 1, \ldots, p\}$ denote the index of the unique non-zero entry of $M_i$ within the last $\ell$ columns of $M$. Using the notation which we have introduced thus far, the system of linear equations $Mx = b$ can be written as the set of $\ell$ equations $L_1, \ldots, L_\ell$, where $L_i$ is the equation

$$x_{m_i} + M_{i,d_i} \cdot x_{d_i} + \sum_{j \in W_i} M_{i,j} \cdot x_j = b_i . \qquad (1)$$

Let us set

$$r = 1 + \sum_{i=1}^{\ell} |W_i| .$$

Observe that, as mentioned in the statement of the lemma, we indeed have $r \le p^2$.

We now define an $r$-uniform $p$-colored hypergraph $K$, which will help us in proving that $Mx = b$ is hypergraph representable as in Definition 2.2. The hypergraph $K$ has $k = r - 1 + p - \ell$ vertices which we denote by $v_1, \ldots, v_{r-1}, u_1, \ldots, u_{p-\ell}$. As for $K$'s edges, it first contains $p - \ell$ edges denoted $e_1, \ldots, e_{p-\ell}$, where $e_i$ contains the vertices $v_1, \ldots, v_{r-1}, u_i$. Note that these edges do not depend on the system $Mx = b$. As we will see later, these edges will help us to "build" the actual representation of the linear equations of $Mx = b$. So in addition to the above $p - \ell$ edges, $K$ also contains $\ell$ edges $f_{p-\ell+1}, \ldots, f_p$, where edge $f_{d_i}$ will^4 represent (in some sense) equation $L_i$, defined in (1). To define these $\ell$ edges it will be convenient to partition the set $[r-1]$ into $\ell$ subsets $I_1, \ldots, I_\ell$ such that $I_1$ contains the numbers $1, \ldots, |W_1|$, and $I_2$ contains the numbers $|W_1| + 1, \ldots, |W_1| + |W_2|$ and so on. With this partition we define for every $1 \le i \le \ell$ edge $f_{d_i}$ to contain the vertices $\{v_t : t \in [r-1] \setminus I_i\}$, the vertices $\{u_j : j \in W_i\}$ as well as vertex $u_{m_i}$. Note that as $|I_i| = |W_i|$ the hypergraph $K$ is indeed $r$-uniform. As for the coloring of the edges of $K$, for every $1 \le i \le p - \ell$ edge $e_i$ is colored $i$ and for every $p - \ell + 1 \le d_i \le p$ edge $f_{d_i}$ is colored $d_i$.
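
The definition of $K$ is purely combinatorial and can be sketched in a few lines. The code below is an illustration only: it takes the sets $W_i$ and the indices $m_i$, $d_i$ as given inputs, represents the vertices $v_t$ and $u_j$ as tagged tuples, and uses the hypothetical name `build_K`. It then lets one confirm on an example that every edge has exactly $r$ vertices.

```python
def build_K(W, m, d, p):
    """Build the edges of K from the sets W_i and the indices m_i, d_i (all 1-based).
    Vertices are tagged tuples ("v", t) and ("u", j); edges are keyed by
    ("e", i) for 1 <= i <= p - ell and ("f", d_i) for 1 <= i <= ell."""
    ell = len(W)
    r = 1 + sum(len(w) for w in W)
    # Partition [r-1] into consecutive blocks I_1, ..., I_ell with |I_i| = |W_i|.
    I, start = [], 1
    for w in W:
        I.append(list(range(start, start + len(w))))
        start += len(w)
    edges = {}
    # Edges e_i: all of v_1..v_{r-1} together with u_i.
    for i in range(1, p - ell + 1):
        edges[("e", i)] = {("v", t) for t in range(1, r)} | {("u", i)}
    # Edges f_{d_i}: v_t for t outside I_i, u_j for j in W_i, and u_{m_i}.
    for i in range(ell):
        vertices = {("v", t) for t in range(1, r) if t not in I[i]}
        vertices |= {("u", j) for j in W[i]}
        vertices.add(("u", m[i]))
        edges[("f", d[i])] = vertices
    return r, I, edges


# Illustrative example: p = 5, ell = 2, W_1 = {1, 2}, W_2 = {2}, m_1 = m_2 = 3.
r, I, edges = build_K(W=[[1, 2], [2]], m=[3, 3], d=[4, 5], p=5)
print(r, I)
print(all(len(edge) == r for edge in edges.values()))
```

In this example $r = 4$, so $K$ has $k = r - 1 + p - \ell = 6$ vertices and $p = 5$ edges, each of size 4.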

Before defining the hypergraph $H$ we need to define $p - \ell$ vectors $a^1, \ldots, a^{p-\ell} \in \mathbb{F}_n^{r-1}$ which we will use when defining $H$. We think of $a^1, \ldots, a^{p-\ell}$ as the $p - \ell$ rows of a $(p - \ell) \times (r - 1)$ matrix $A$. Furthermore, for every $1 \le i \le \ell$ let $A_i$ be the sub-matrix of $A$ which contains the columns whose indices belong to $I_i$ (which was defined above). We now take the (square) sub-matrix of $A_i$ which contains the rows whose indices belong to $W_i$ to be the identity matrix (over $\mathbb{F}_n$). More precisely, if the elements of $W_i$ are $j_1 < j_2 < \cdots < j_{|W_i|}$ then $(A_i)_{j_g,g} = 1$ for every $1 \le g \le |W_i|$, and $0$ otherwise^5. For future reference, let us denote by $A'_i$ this square sub-matrix of $A_i$. We finally set row $m_i$ of $A_i$ to be the vector whose $g$-th entry is $-M_{i,j_g}$, where as above $j_g$ is the $g$-th element of $W_i$. If $A_i$ has any other rows besides the ones defined above, we set them to 0. As each column of $A$ belongs to one of the matrices $A_i$ we have thus defined $A$ and therefore also the vectors $a^1, \ldots, a^{p-\ell}$.

Let us make two simple observations regarding the above defined vectors which we will use later.

^4 Note that we are using the fact that $d_1, \ldots, d_\ell$ are distinct numbers in $\{p - \ell + 1, \ldots, p\}$.
^5 Note that the second index of $(A_i)_{j_g,g}$ refers to the column number within $A_i$, not $A$.

First, let $1 \le i \le \ell$ and $t \in I_i$ and suppose $t$ is the $g$-th element of $I_i$.^6 Then

$$\sum_{j \in W_i} a^j_t \cdot M_{i,j} = (A_i)_{j_g,g} \cdot M_{i,j_g} = M_{i,j_g} = -(A_i)_{m_i,g} = -a^{m_i}_t , \qquad (2)$$

where the first equality is due to the fact that the only non-zero entry within column $g$ of $A_i$ and the rows from $W_i$ appears in row $j_g$. The second equality uses the fact that this entry is in fact 1. The third equality uses the definition of row $m_i$ of $A_i$.
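
The construction of the matrix $A$ and the first observation can be checked mechanically on a small example. The following sketch assumes a prime modulus `q` standing in for the field size, takes the sets $W_i$ (sorted increasingly) and the indices $m_i$ as inputs, and verifies that $\sum_{j \in W_i} a^j_t M_{i,j} = -a^{m_i}_t$ for every $t \in I_i$. The names `build_A` and `check_observation` are hypothetical.

```python
def build_A(M, W, m, q):
    """Build the (p - ell) x (r - 1) matrix A over the integers mod a prime q.
    W[i] is the increasingly sorted list W_i and m[i] is m_i (1-based indices).
    Returns A (0-based nested lists) and the blocks I_1, ..., I_ell (1-based)."""
    ell = len(W)
    p = len(M[0])
    r = 1 + sum(len(w) for w in W)
    A = [[0] * (r - 1) for _ in range(p - ell)]
    I, start = [], 1
    for i in range(ell):
        block = list(range(start, start + len(W[i])))
        I.append(block)
        start += len(W[i])
        for g, t in enumerate(block):
            j_g = W[i][g]
            A[j_g - 1][t - 1] = 1                         # identity part on rows W_i
            A[m[i] - 1][t - 1] = (-M[i][j_g - 1]) % q     # row m_i holds -M_{i, j_g}
    return A, I


def check_observation(M, W, m, q):
    """Check sum_{j in W_i} a^j_t * M_{i,j} == -a^{m_i}_t (mod q) for all t in I_i."""
    A, I = build_A(M, W, m, q)
    for i in range(len(W)):
        for t in I[i]:
            lhs = sum(A[j - 1][t - 1] * M[i][j - 1] for j in W[i]) % q
            rhs = (-A[m[i] - 1][t - 1]) % q
            if lhs != rhs:
                return False
    return True


# Illustrative example over F_7 with p = 5, ell = 2 and M_{i, m_i} = 1 in both rows.
M_example = [
    [1, 2, 1, 1, 0],
    [0, 4, 1, 0, 1],
]
A_example, I_example = build_A(M_example, W=[[1, 2], [2]], m=[3, 3], q=7)
print(A_example)
print(check_observation(M_example, W=[[1, 2], [2]], m=[3, 3], q=7))
```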
The second observation we will need is the following. 

Claim 3.2 For $1 \le i \le \ell$, let $B_i$ be the following $(r-1) \times (r-1)$ matrix: for every $j \in [r-1] \setminus I_i$ we have $(B_i)_{j,j} = 1$ and $(B_i)_{j,t} = 0$ for $t \ne j$. The other $|I_i|$ rows of $B_i$ are the $|W_i|$ $(= |I_i|)$ vectors $\{a^t : t \in W_i\}$. Then, for every $1 \le i \le \ell$ the matrix $B_i$ is non-singular.

Proof: To show that $B_i$ is non-singular it is clearly enough to show that its $|I_i| \times |I_i|$ minor $B'_i$, which is determined by $I_i$, is non-singular. But observe that this fact follows from the way we have defined the vectors $a^1, \ldots, a^{p-\ell}$ above, because $B'_i$ is just $A'_i$, which is in fact the identity matrix. ■

We are now ready to define, for every set of subsets $S_1, \ldots, S_p \subseteq \mathbb{F}_n$, the hypergraph $H$ which will
establish that $Mx = b$ is hypergraph representable. The vertex set of $H$ consists of $k$ ($= r - 1 + p - \ell$)
disjoint sets $V_1, \ldots, V_{r-1}, U_1, \ldots, U_{p-\ell}$, where each of these sets contains $n$ vertices and we think of
the elements of each of these sets as the elements of $\mathbb{F}_n$. As for the edges of $H$, we first put for
$1 \le i \le p - \ell$ and every choice of $r - 1$ vertices $x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$ and element $s \in S_i$, an edge
with color $i$ and label $s$, which contains the vertices $x_1, \ldots, x_{r-1}$ as well as vertex $y \in U_i$, where

$$y = s + \sum_{t=1}^{r-1} a^i_t \cdot x_t \qquad (3)$$

and the values $a^i_t$ were defined above. These edges will later play the role of the edges $e_1, \ldots, e_{p-\ell}$
of $K$ defined above. Note that these edges are defined irrespectively of the set of equations $Mx = b$.
We now define the edges of $H$ which will "simulate" the linear equations of $Mx = b$. For every
$1 \le i \le \ell$, and for every choice of an element $s \in S_{d_i}$, for every choice of $r - 1 - |I_i|$ vertices
$\{x_t \in V_t : t \in [r-1] \setminus I_i\}$ and for every choice of $|W_i|$ ($= |I_i|$) vertices $\{y_j \in U_j : j \in W_i\}$ we have
an edge with color $d_i$ and label $s$, which contains the vertices $\{x_t : t \in [r-1] \setminus I_i\}$ and $\{y_j : j \in W_i\}$
as well as vertex $y \in U_{m_i}$, where

$$y = b_i - M_{i,d_i} \cdot s - \sum_{j \in W_i} M_{i,j} \cdot y_j + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a^{m_i}_t + \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) . \qquad (4)$$

Let us first note that as required by Lemma 2.4 each edge of $H$ has a color $i$ and is labeled by
an element $s \in S_i$. In fact, for each $1 \le i \le p$ and for each $s \in S_i$, the hypergraph $H$ has $n^{r-1}$ edges
that are colored $i$ and labeled with $s$. We start with the following claim.



Footnote: Note that $t$ is an index of a column of $A$, while $g$ is an index of a column of $A_i$.




Claim 3.3 H is a simple hypergraph, that is, it contains no parallel edges. 

Proof: Observe that edges of $H$ with different colors have a single vertex from a different subset of
$r$ of the sets $V_1, \ldots, V_{r-1}, U_1, \ldots, U_{p-\ell}$. Indeed, edges with color $1 \le i \le p - \ell$ contain a vertex from
each of the sets $V_1, \ldots, V_{r-1}$ and another vertex from $U_i$, while an edge with color $p - \ell + 1 \le d_i \le p$
contains vertices from the sets $\{V_t : t \in [r-1] \setminus I_i\}$ as well as vertices from some of the sets
$U_1, \ldots, U_{p-\ell}$. Note that the sets $I_1, \ldots, I_\ell$ are disjoint and non-empty, as none of the sets $W_i$ is
empty, a fact which (as noted previously) follows from Claim 3.1. Observe that if $W_i$ were empty,
then edges with color $d_i$ would have had parallel edges with color $m_i$.

As for edges with the same color $1 \le i \le p - \ell$, recall that they are defined in terms of a different
combination of $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$ and $s \in S_i$. So if one edge is defined in terms of $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$
and $s \in S_i$ and another using $x'_1, \ldots, x'_{r-1} \in \mathbb{F}_n$ and $s' \in S_i$ then either (i) $x_j \ne x'_j$ for some
$1 \le j \le r - 1$, in which case the edges have a different vertex in $V_j$, or (ii) $x_j = x'_j$ for all $1 \le j \le r - 1$,
implying that $s \ne s'$. In the latter case the edges have a different vertex in $U_i$ by the way we chose the
vertex in this set in (3).

The case of edges with the same color $p - \ell + 1 \le d_i \le p$ is similar. Recall that such edges are
defined in terms of a different combination of $\{x_t : t \in [r-1] \setminus I_i\}$, $\{y_j : j \in W_i\}$ and $s \in S_{d_i}$. So
if one edge is defined in terms of $\{x_t : t \in [r-1] \setminus I_i\}$, $\{y_j : j \in W_i\}$ and $s \in S_{d_i}$ and another using
$\{x'_t : t \in [r-1] \setminus I_i\}$, $\{y'_j : j \in W_i\}$ and $s' \in S_{d_i}$ then either (i) $x_t \ne x'_t$ for some $t \in [r-1] \setminus I_i$,
in which case the edges have a different vertex in $V_t$, or (ii) $y_j \ne y'_j$ for some $j \in W_i$, in which case the
edges have a different vertex in $U_j$, or (iii) $x_t = x'_t$ for all $t \in [r-1] \setminus I_i$, and $y_j = y'_j$ for all $j \in W_i$,
implying that $s \ne s'$ and therefore the edges have a different vertex in $U_{m_i}$ by the way we chose the
vertex in this set in (4) and from the fact that $M_{i,d_i} \ne 0$. ■

The above claim establishes the first property required by Definition 2.2 and we now turn
to establish the second and third. Fix arbitrary elements $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$. For every
choice of $r - 1$ (not necessarily distinct) elements $x_1, \ldots, x_{r-1} \in \mathbb{F}_n$, let $K_x$ be the set of vertices
$x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}, y_1 \in U_1, \ldots, y_{p-\ell} \in U_{p-\ell}$, where for every $1 \le j \le p - \ell$

$$y_j = s_j + \sum_{t=1}^{r-1} a^j_t \cdot x_t . \qquad (5)$$

We will need the following important claim regarding the vertices of $K_x$. Getting back to the
overview of the proof given in Subsection 2.1, this is where we extract one of the linear equations
$L_i$ (defined above) from a certain combination of edges of a copy of $K$. We also note that the
linear equation we "initially" obtain (see (6)) includes also the elements $x_j$, but the way we have
constructed $H$ guarantees that the $x_j$'s vanish and we eventually get a linear equation involving only
elements from the sets $S_i$. We will then use this claim to show that $H$ contains many edge disjoint
copies of $K$ when $s_1, \ldots, s_{p-\ell}$ determine a solution to $Mx = b$, and in the other direction, that $H$
cannot contain too many copies of $K$. For what follows we remind the reader that for $1 \le i \le \ell$ we
have $p - \ell + 1 \le d_i \le p$ and that for $i < i'$ we have $d_i \ne d_{i'}$. Returning to the overview of the proof
given in Subsection 2.1, we are now going to use the fact that edges with colors $d_i$ and $m_i$ have a
common vertex in $U_{m_i}$ in order to deduce the linear equation $L_i$.

Claim 3.4 Let $1 \le i \le \ell$. Then the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup y_{m_i}$ span an
edge (of color $d_i$) if and only if there is an element $s_{d_i} \in S_{d_i}$ such that $\{s_j : j \in W_i\} \cup s_{m_i} \cup s_{d_i}$
satisfy equation $L_i$ (defined in (1)).

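Proof: $H$ contains an edge containing the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup y_{m_i}$ if
and only if (recall (4)) there is an $s_{d_i} \in S_{d_i}$ such that

$$y_{m_i} = b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot y_j + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a^{m_i}_t + \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) . \qquad (6)$$

Using (5) this is equivalent to requiring that

$$\begin{aligned}
s_{m_i} + \sum_{t=1}^{r-1} a^{m_i}_t \cdot x_t
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot \Big( s_j + \sum_{t=1}^{r-1} a^j_t \cdot x_t \Big) + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a^{m_i}_t + \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) \\
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j - \sum_{t=1}^{r-1} x_t \cdot \Big( \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) + \sum_{t \in [r-1] \setminus I_i} x_t \cdot \Big( a^{m_i}_t + \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) \\
&= b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j - \sum_{t \in I_i} x_t \cdot \Big( \sum_{j \in W_i} a^j_t \cdot M_{i,j} \Big) + \sum_{t \in [r-1] \setminus I_i} x_t \cdot a^{m_i}_t .
\end{aligned}$$

Using (2) in the last row above, we can write the above requirement as

$$s_{m_i} + \sum_{t=1}^{r-1} a^{m_i}_t \cdot x_t = b_i - M_{i,d_i} \cdot s_{d_i} - \sum_{j \in W_i} M_{i,j} \cdot s_j + \sum_{t=1}^{r-1} a^{m_i}_t \cdot x_t ,$$

or equivalently that

$$s_{m_i} + M_{i,d_i} \cdot s_{d_i} + \sum_{j \in W_i} M_{i,j} \cdot s_j = b_i ,$$

which is precisely equation $L_i$. ■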
For the next two claims, let us recall that we assume that the last $\ell$ columns of $M$ form a diagonal
matrix. Therefore, a solution to $Mx = b$ is determined by the first $p - \ell$ elements of $x$.
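
The remark that a solution is determined by its first $p - \ell$ coordinates can be illustrated with a minimal sketch. It assumes a prime modulus `q` standing in for the field size and a small example matrix; the function name `extend_solution` and the sample data are hypothetical and not taken from the paper. The sketch only computes the forced values of the last $\ell$ coordinates; whether those values also lie in the sets $S_{p-\ell+1}, \ldots, S_p$ is a separate check.

```python
def extend_solution(M, b, free_values, q):
    """Extend the first p - ell coordinates to a full solution of Mx = b over F_q.

    Assumptions (for illustration only): q is prime, M is an ell x p matrix
    given as a list of rows, and the last ell columns of M form a diagonal
    matrix with non-zero diagonal entries.
    """
    ell = len(M)
    p = len(M[0])
    assert len(free_values) == p - ell
    x = [v % q for v in free_values]
    for i in range(ell):
        diag = M[i][p - ell + i] % q
        assert diag != 0, "diagonal entry must be invertible"
        # Contribution of the first p - ell coordinates to equation i.
        partial = sum(M[i][j] * x[j] for j in range(p - ell)) % q
        # Equation i forces exactly one value for coordinate p - ell + i.
        x.append((b[i] - partial) * pow(diag, -1, q) % q)
    return x
```

For instance, with `M = [[1, 2, 1, 0], [3, 1, 0, 2]]`, `b = [4, 0]` and `q = 5`, every choice of the first two coordinates yields exactly one completion, which is why counting solutions reduces to counting choices of $s_1, \ldots, s_{p-\ell}$.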






Claim 3.5 Suppose $s_1, \ldots, s_{p-\ell}$ determine a solution $s_1, \ldots, s_p$ to $Mx = b$. Then, any set $K_x$
(defined above) spans a colored copy of $K$. In particular, for every solution $s_1, \ldots, s_p$ to $Mx = b$, $H$
has $n^{r-1}$ colored copies of $K$, in which the edge of color $i$ is labeled with $s_i$.

Proof: We claim that $K_x$ spans a colored copy of $K$, where for every $1 \le i \le r - 1$ vertex $v_i$ of $K$
is mapped to vertex $x_i$ of $H$, and for every $1 \le j \le p - \ell$ vertex $u_j$ of $K$ is mapped to vertex $y_j$ of
$H$. To see that the above is a valid mapping of the colored edges of $K$ to colored edges of $H$, we first
note that the way we have defined $H$ in (3) and the vertices $y_1, \ldots, y_{p-\ell}$ in (5) guarantees that for
every $1 \le j \le p - \ell$ we have an edge with color $j$ which contains the vertices $x_1, \ldots, x_{r-1}, y_j$. This
is actually true even if $s_1, \ldots, s_{p-\ell}$ do not determine a solution.

As for edges with color $p - \ell + 1 \le d_i \le p$, the fact that the vertices $\{x_t : t \in [r-1] \setminus I_i\} \cup \{y_j : j \in
W_i\} \cup y_{m_i}$ span such an edge follows from Claim 3.4 because we assume that $s_1, \ldots, s_{p-\ell}$ determine
a solution to $Mx = b$, so for every $1 \le i \le \ell$ there exists an element $s_{d_i} \in S_{d_i}$ as required by Claim
3.4. We thus conclude that $x_1, \ldots, x_{r-1}, y_1, \ldots, y_{p-\ell}$ span a colored copy of $K$. Finally, note that
by the way we have defined $H$, the edge of $K_x$ which is colored $i$ is indeed labeled with the element
$s_i \in S_i$. ■

Claim 3.6 If $s_1, \ldots, s_{p-\ell}$ determine a solution to $Mx = b$, then the $n^{r-1}$ colored copies of $K$ spanned
by the sets $K_x$ (defined above) are edge disjoint.

Proof: Let us consider two colored copies $K_x$ and $K_y$ for some $x \ne y$ (Claim 3.5 guarantees that
$K_x$ and $K_y$ indeed span a colored copy of $K$). Clearly $K_x$ and $K_y$ cannot share edges with color
$1 \le i \le p - \ell$, because the vertices of such edges within $V_1, \ldots, V_{r-1}$ are uniquely determined by the
coordinates of $x$ and $y$.

We now consider an edge of $K_x$ with color $d_i \in \{p - \ell + 1, \ldots, p\}$. Let $j_1 < j_2 < \ldots < j_{|W_i|}$ be the
elements of $W_i$, and let $B_i$ be the matrix defined in Claim 3.2. Recall that $B_i$ satisfies the following:
(i) for $j \in [r-1] \setminus I_i$ we have $(B_i)_{j,j} = 1$ and $(B_i)_{j,t} = 0$ when $t \ne j$, and (ii) if $j \in I_i$ is the $g$-th
element of $I_i$, then the $j$-th row of $B_i$ is the vector $a^{j_g}$ (where $j_g$ is the $g$-th element of $W_i$). Let us also
define an $r - 1$ dimensional vector $c$ as follows: for every $j \in [r-1] \setminus I_i$ we have $c_j = 0$, and for every
$j \in I_i$, if $j$ is the $g$-th element of $I_i$ then $c_j = s_{j_g}$. The key observation now is that the vertices of the
edge whose color is $d_i \in \{p - \ell + 1, \ldots, p\}$ within the $r - 1$ sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$
are given by $B_i x + c$. More precisely, for every $j \in [r-1] \setminus I_i$ the vertex of the edge of color $d_i$ within
$V_j$ is given by $(B_i x + c)_j$. Also, for every $j_g \in W_i$, if $j \in I_i$ is the $g$-th element of $I_i$, then the vertex



Footnote: We remark that when we defined the matrices $B_i$ in Claim 3.2 we did not "impose" the ordering of the rows
that correspond to $W_i$ as we do here, but this ordering, of course, does not affect the rank of $B_i$.
of this edge within $U_{j_g}$ is given by $(B_i x + c)_j$. Claim 3.2 asserts that $B_i$ is non-singular, so we can
conclude that the edges with color $d_i$ of $K_x$ and $K_y$ can share at most $r - 2$ of their $r - 1$ vertices
within the sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$. So any pair of edges of color $d_i$ can share at
most $r - 1$ vertices, and therefore $K_x$ and $K_y$ are edge disjoint. ■
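
The step from "$B_i$ is non-singular" to "at most $r - 2$ shared vertices" rests on the fact that $x \mapsto B_i x + c$ is injective, so two distinct inputs must differ in at least one output coordinate. The following brute-force sketch checks this on a tiny example; it assumes a prime modulus `q` in place of the field size, and the matrices, the vector `c` and the function names are hypothetical illustrations, not objects from the paper.

```python
from itertools import product


def affine_image(B, c, x, q):
    """Return B x + c over F_q as a tuple (q assumed prime)."""
    return tuple(
        (sum(B[j][t] * x[t] for t in range(len(x))) + c[j]) % q
        for j in range(len(B))
    )


def max_shared_coordinates(B, c, q):
    """Largest number of coordinates on which B x + c and B y + c agree, over x != y."""
    dim = len(B)
    points = list(product(range(q), repeat=dim))
    best = 0
    for x in points:
        image_x = affine_image(B, c, x, q)
        for y in points:
            if x != y:
                image_y = affine_image(B, c, y, q)
                shared = sum(u == v for u, v in zip(image_x, image_y))
                best = max(best, shared)
    return best
```

With a 3 x 3 matrix (playing the role of $r - 1 = 3$), a non-singular choice such as `[[1, 0, 0], [2, 1, 1], [0, 1, 2]]` over $\mathbb{F}_3$ never lets two distinct inputs agree in all three coordinates, whereas a singular matrix does allow full agreement.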

Claim 3.7 If $S_1, \ldots, S_p$ contain $T$ solutions to $Mx = b$ with $x_i \in S_i$ then $H$ contains $T n^{r-1}$ colored
copies of $K$.

Proof: Recall that we assume that the last $\ell$ columns of $M$ form a diagonal matrix. Therefore, the
number of solutions $T$ to $Mx = b$ is just the number of choices of $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$ that can
be extended to a solution of $Mx = b$ by choosing appropriate values $s_{p-\ell+1} \in S_{p-\ell+1}, \ldots, s_p \in S_p$.
Therefore, it is enough to show that every colored copy of $K$ in $H$ is given by a choice of $r - 1$ vertices
$x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$ and a choice of $p - \ell$ elements $s_1 \in S_1, \ldots, s_{p-\ell} \in S_{p-\ell}$ that determine a
solution to $Mx = b$. So let us consider a colored copy of $K$ in $H$. This copy must contain edges with
the colors $1, \ldots, p - \ell$. By the way we have defined $H$ this means that this copy must contain $r - 1$
vertices $x_1 \in V_1, \ldots, x_{r-1} \in V_{r-1}$ as well as $p - \ell$ vertices $y_1 \in U_1, \ldots, y_{p-\ell} \in U_{p-\ell}$. Furthermore,
for $1 \le j \le p - \ell$ we have

$$y_j = s_j + \sum_{t=1}^{r-1} a^j_t \cdot x_t \qquad (7)$$

for some choice of $s_j \in S_j$. So the vertex set of such a copy is determined by the choice of $x_1, \ldots, x_{r-1}$
and $s_1, \ldots, s_{p-\ell}$. Note that the set of vertices is just the set $K_x$ defined before Claim 3.4 for
$x_1, \ldots, x_{r-1}$ and $s_1, \ldots, s_{p-\ell}$. Therefore, we can apply Claim 3.4 on this set of vertices.

So our goal now is to show that there are elements $s_{p-\ell+1}, \ldots, s_p$ which together with $s_1, \ldots, s_{p-\ell}$
form a solution of $Mx = b$. Consider any $1 \le i \le \ell$. As the vertices at hand span a colored copy
of $K$ they must span an edge with color $d_i$. This edge must contain the vertices $\{x_t : t \in
[r-1] \setminus I_i\} \cup \{y_j : j \in W_i\} \cup y_{m_i}$. But by Claim 3.4 if these vertices span an edge (of color $d_i$) then
there is an element $s_{d_i} \in S_{d_i}$ such that $\{s_j : j \in W_i\} \cup s_{m_i} \cup s_{d_i}$ satisfy equation $L_i$. As this holds
for every $1 \le i \le \ell$ we deduce that $s_1, \ldots, s_p$ satisfy $Mx = b$. ■

The proof of Lemma 2.4 now follows from Claims 3.3, 3.5, 3.6 and 3.7.

4 Concluding Remarks and Open Problems 

• Our removal lemma for sets of linear equations works over any field. For the special case of
a single linear equation, Král', Serra and Vena [13] (following Green [10]) proved a removal



Footnote: We note that the way we have defined $H$ does not (necessarily) guarantee that edges of the same color cannot
share $r - 1$ vertices. That is, edges of color $d_i$ may share the vertex in the set $U_{m_i}$ and $r - 2$ of the $r - 1$ vertices from
the sets $\{V_j : j \in [r-1] \setminus I_i\} \cup \{U_j : j \in W_i\}$.

Footnote: Because only vertices from this combination of $r$ of the sets $V_1, \ldots, V_{r-1}, U_1, \ldots, U_{p-\ell}$ span an edge with color $d_i$.
lemma over any group. It is natural to ask if a similar removal lemma over groups, or even
just abelian groups, also holds for sets of linear equations.



• Green [10] used the regularity lemma for groups in order to resolve a conjecture of Bergelson,
Host, Kra and Ruzsa [1], which stated that every $S \subseteq [n]$ of size $\delta n$ contains at least $(\delta^3 -
o(1))n$ 3-term arithmetic progressions with a common difference. The analogous statement for
arithmetic progressions of length more than 4 was shown to be false in [4]. So the only case
left open is whether any $S \subseteq [n]$ of size $\delta n$ contains at least $(\delta^4 - o(1))n$ 4-term arithmetic
progressions with a common difference. Part of the motivation of Green for raising Conjecture
1 was that it may help in resolving the case of the 4-term arithmetic progression. It seems very
interesting to see if Theorem 2 can indeed help in resolving this conjecture.

• Our proof of the removal lemma for sets of linear equations applies the hypergraph removal
lemma. As a consequence, we get extremely poor bounds relating $\epsilon$ and $\delta$. Roughly speaking,
the best current bounds for the graph removal lemma give that $\delta(\epsilon)$ grows like Tower$(1/\epsilon)$,
that is, a tower of exponents of height $1/\epsilon$. For 3-uniform hypergraphs, the bounds are given
by iterating the Tower function $1/\epsilon$ times, and so on. So on the one hand, the fact that we
are using hypergraphs with a large degree of uniformity implies that the bounds we get are
extremely weak. On the other hand, as even the graph removal lemma gives bounds which are
too weak for any reasonable application, this is not a serious concern. It
may still be interesting, however, to see if one can prove Theorem 2 with a proof similar to the
one given in [13] for the special case of a single equation.

• Given the above discussion it is reasonable to ask for which sets of equations $Mx = b$ one can
get a polynomial dependence between $\epsilon$ and $\delta$. This seems to be a challenging open problem
even for a single equation so let us focus on this case. For a linear equation $L$, let $r_L(n)$ denote
the size of the largest subset of $[n]$ which contains no (non-trivial) solution to $L$. Problems of
this type were studied by Ruzsa [16]. A simple counting argument shows that if $r_L(n) = n^{1-c}$
for some positive $c$, then $\delta(\epsilon) = O(1/\epsilon)^{1/c}$. However, characterizing the equations with this
property seems like a very hard problem, see [16]. Furthermore, we do not even know if all the
linear equations for which $r_L(n) = n^{1-o(1)}$ do not have a polynomial dependence between $\epsilon$ and
$\delta$. For example, we do not know if such a dependence exists for the linear equation $x_1 + x_2 = x_3$
(for which $r_L(n) = \Theta(n)$).

But for at least some of these linear equations, we can rule out such a polynomial dependence
as the following example shows. Consider the linear equation $x_1 + x_3 = 2x_2$, that is, the linear
equation which defines a 3-term arithmetic progression. We claim that for this equation
there is no polynomial relation between $\epsilon$ and $\delta$. Fix an $\epsilon$ and let $n_0 = n_0(\epsilon)$ be large enough

Footnote: The argument can be extended to any linear equation in which one variable is a convex combination of the others.
so that for every $n \ge n_0$, every $S \subseteq [n]$ of size $\epsilon n$ contains a 3-term arithmetic progression. Roth's Theorem
[16] states that such an $n_0$ exists. Therefore, for every $n \ge n_0$ and for every $S \subseteq [n]$ of size
$2\epsilon n$ we have to remove at least $\epsilon n$ elements from $S$ in order to destroy all 3-term arithmetic
progressions. Let $m$ be the largest integer for which $[m]$ contains a subset of size $4\epsilon m$, containing
no 3-term arithmetic progressions. The well known construction of Behrend [3] implies that
$m \ge (1/\epsilon)^{c \log(1/\epsilon)}$ for some absolute constant $c$. Let $X$ be one such subset of $[m]$. For every
$n \ge n_0$, let $S \subseteq [n]$ be the set of integers with the property that in their base $2m$ representation,
the least significant digit belongs to $X$. Then clearly $|S| = n \cdot \frac{4\epsilon m}{2m} = 2\epsilon n$ and so one should
remove at least $\epsilon n$ elements from $S$ to destroy all 3-term arithmetic progressions. On the other
hand if $x_1, x_2, x_3 \in S$ form a 3-term arithmetic progression then as $X \subseteq [m]$, so do the least
significant digits of $x_1, x_2, x_3$, because there is no carry in the base $2m$ addition. But as
these digits belong to $X$ we get that they must be identical. Therefore, the number of
3-term arithmetic progressions in $S$ is $|S|^2/m \le \epsilon^{c \log(1/\epsilon)} n^2$, implying that $\delta(\epsilon) \le \epsilon^{c \log(1/\epsilon)}$.

• The contrapositive version of our main result says that if one should remove $\epsilon n$ elements from
$S \subseteq [n]$ in order to destroy all solutions of $Mx = b$ then $S$ contains $\delta(\epsilon) n^{p-\ell}$ solutions to
$Mx = b$. The "analogous" result for graphs (or hypergraphs) is that if one should remove $\epsilon n^2$
edges from a graph $G$ in order to destroy all the copies of $H$ then $G$ contains $\delta(\epsilon) n^h$ copies of $H$
(where $h$ is the number of vertices of $H$). The main result of [1] is an "infinite" version of the
removal lemma for graphs, which states that if $\mathcal{H}$ is a (possibly infinite) set of graphs, and if one
should remove $\epsilon n^2$ edges from $G$ in order to destroy all the copies of all the graphs $H \in \mathcal{H}$ then
for some $H \in \mathcal{H}$, whose size $h$ satisfies $h \le h(\epsilon)$, $G$ contains $\delta(\epsilon) n^h$ copies of $H$. It seems natural
to ask if there is a corresponding "infinite" removal lemma for sets of linear equations. More
precisely, is it the case that for every (possibly infinite) set $\mathcal{M} = \{M_1 x = b_1, M_2 x = b_2, \ldots\}$
of sets of linear equations the following holds: if one should remove $\epsilon n$ elements from $S \subseteq [n]$
in order to destroy all the solutions to all the sets of linear equations in $\mathcal{M}$, then for some set
of linear equations $Mx = b \in \mathcal{M}$, with $p \le p(\epsilon)$ unknowns, $S$ contains $\delta(\epsilon) n^{p-\ell}$ solutions to
$Mx = b$?
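
The digit construction used in the 3-term arithmetic progression example above (a progression-free set $X \subseteq [m]$ lifted to the integers whose least significant base-$2m$ digit lies in $X$) can be checked on a toy instance. This is a minimal sketch under stated assumptions: the small set `X = {1, 2, 4, 5}` is a hand-picked progression-free set, not Behrend's construction, and the function names are hypothetical.

```python
def is_3ap_free(X):
    """True if X contains no non-trivial 3-term arithmetic progression."""
    Xs = set(X)
    return not any(
        a < c and (a + c) % 2 == 0 and (a + c) // 2 in Xs
        for a in Xs
        for c in Xs
    )


def build_digit_set(X, m, n):
    """Integers in [1, n] whose least significant base-2m digit belongs to X (X a subset of [1, m])."""
    Xs = set(X)
    return [v for v in range(1, n + 1) if v % (2 * m) in Xs]


def three_term_aps(S):
    """All non-trivial 3-term progressions (a, b, c) with a < b < c inside S."""
    Ss = set(S)
    return [
        (a, (a + c) // 2, c)
        for a in S
        for c in S
        if a < c and (a + c) % 2 == 0 and (a + c) // 2 in Ss
    ]
```

Taking `m = 5` and `n = 200`, every progression found by `three_term_aps(build_digit_set({1, 2, 4, 5}, 5, 200))` has all three terms congruent modulo `2 * m`, which is the "identical least significant digits" property the argument relies on: since the digits are at most $m$, adding two of them never produces a carry.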